Building Large Machine Reading-Comprehension Datasets using Paragraph Vectors
Abstract
We present a dual contribution to the task of machine reading-comprehension: a technique for creating large machine-comprehension (MC) datasets using paragraph-vector models, and a novel hybrid neural-network architecture that combines the representation power of recurrent neural networks with the discriminative power of fully connected multi-layered networks. We use the MC-dataset generation technique to build a dataset of around 2 million examples, for which we empirically determine the high ceiling of human performance (around 91% accuracy), as well as the performance of a variety of computer models. Among all the models we experimented with, our hybrid neural-network architecture achieves the highest performance (83.2% accuracy). The remaining gap to the human-performance ceiling leaves room for future model improvements.
Similar resources
Dataset for the First Evaluation on Chinese Machine Reading Comprehension
Machine Reading Comprehension (MRC) has become enormously popular recently and has attracted a lot of attention. However, existing reading comprehension datasets are mostly in English. To add diversity in reading comprehension datasets, in this paper we propose a new Chinese reading comprehension dataset for accelerating related research in the community. The proposed dataset contains two diff...
Constructing Datasets for Multi-hop Reading Comprehension Across Documents
Most Reading Comprehension methods limit themselves to queries which can be answered using a single sentence, paragraph, or document. Enabling models to combine disjoint pieces of textual evidence would extend the scope of machine comprehension methods, but currently there exist no resources to train and test this capability. We propose a novel task to encourage the development of models for te...
Simple and Effective Multi-Paragraph Reading Comprehension
We consider the problem of adapting neural paragraph-level question answering models to the case where entire documents are given as input. Our proposed solution trains models to produce well-calibrated confidence scores for their results on individual paragraphs. We sample multiple paragraphs from the documents during training, and use a shared-normalization training objective that encourages t...
Start and End Interactions in Bidirectional Attention Flow for Reading Comprehension
The reading comprehension machine learning task involves reading in a question and returning an answer from an associated context paragraph. This task has proven to be difficult, as the performance of state-of-the-art models still does not compare with human performance. The difficulty of the task comes from understanding two separate pieces of information as well as the relationship between the...
Assignment 4: Reading Comprehension
Reading comprehension is the task of understanding a piece of text by a machine. We train an end-to-end neural network that models the conditional distribution of start and end indices, given the question and context paragraph. We build on top of the baseline suggested in the Assignment, and explore new models to implement attention. We also measure the performance of the models and analyse the...
Journal: CoRR
Volume: abs/1612.04342
Publication year: 2016